AITopics | mixup and cutmix

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective

Neural Information Processing SystemsDec-25-2025, 14:07:18 GMT

We propose the first unified theoretical analysis of mixed sample data augmentation (MSDA), such as Mixup and CutMix. Our theoretical results show that regardless of the choice of the mixing strategy, MSDA behaves as a pixel-level regularization of the underlying training loss and a regularization of the first layer parameters. Similarly, our theoretical results support that the MSDA training strategy can improve adversarial robustness and generalization compared to the vanilla training strategy. Using the theoretical results, we provide a high-level understanding of how different design choices of MSDA work differently. For example, we show that the most popular MSDA methods, Mixup and CutMix, behave differently, e.g., CutMix regularizes the input gradients by pixel distances, while Mixup regularizes the input gradients regardless of pixel distances. Our theoretical results also show that the optimal MSDA strategy depends on tasks, datasets, or model parameters. From these observations, we propose generalized MSDAs, a Hybrid version of Mixup and CutMix (HMix) and Gaussian Mixup (GMix), simple extensions of Mixup and CutMix. Our implementation can leverage the advantages of Mixup and CutMix, while our implementation is very efficient, and the computation cost is almost neglectable as Mixup and CutMix. Our empirical study shows that our HMix and GMix outperform the previous state-of-the-art MSDA methods in CIFAR-100 and ImageNet classification tasks.

mixup and cutmix, sample data augmentation, unified analysis, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective

Neural Information Processing SystemsAug-19-2025, 14:55:35 GMT

Using the theoretical results, we provide a high-level understanding of how different design choices of MSDA work differently.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Add feedback

e6f32e64b9c27d153b46c94f0fe22b56-Paper-Conference.pdf

Neural Information Processing SystemsAug-19-2025, 14:55:32 GMT

cutmix, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Add feedback

DropMix: Better Graph Contrastive Learning with Harder Negative Samples

Ma, Yueqi, Chen, Minjie, Li, Xiang

arXiv.org Artificial IntelligenceOct-15-2023

While generating better negative samples for contrastive learning has been widely studied in the areas of CV and NLP, very few work has focused on graph-structured data. Recently, Mixup has been introduced to synthesize hard negative samples in graph contrastive learning (GCL). However, due to the unsupervised learning nature of GCL, without the help of soft labels, directly mixing representations of samples could inadvertently lead to the information loss of the original hard negative and further adversely affect the quality of the newly generated harder negative. To address the problem, in this paper, we propose a novel method DropMix to synthesize harder negative samples, which consists of two main steps. Specifically, we first select some hard negative samples by measuring their hardness from both local and global views in the graph simultaneously. After that, we mix hard negatives only on partial representation dimensions to generate harder ones and decrease the information loss caused by Mixup. We conduct extensive experiments to verify the effectiveness of DropMix on six benchmark datasets. Our results show that our method can lead to better GCL performance. Our data and codes are publicly available at https://github.com/Mayueq/DropMix-Code.

dropmix, learning, negative sample, (13 more...)

arXiv.org Artificial Intelligence

2310.09764

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective

Park, Chanwoo, Yun, Sangdoo, Chun, Sanghyuk

arXiv.org Artificial IntelligenceAug-21-2022

We propose the first unified theoretical analysis of mixed sample data augmentation (MSDA), such as Mixup and CutMix. Our theoretical results show that regardless of the choice of the mixing strategy, MSDA behaves as a pixel-level regularization of the underlying training loss and a regularization of the first layer parameters. Similarly, our theoretical results support that the MSDA training strategy can improve adversarial robustness and generalization compared to the vanilla training strategy. Using the theoretical results, we provide a high-level understanding of how different design choices of MSDA work differently. For example, we show that the most popular MSDA methods, Mixup and CutMix, behave differently, e.g., CutMix regularizes the input gradients by pixel distances, while Mixup regularizes the input gradients regardless of pixel distances. Our theoretical results also show that the optimal MSDA strategy depends on tasks, datasets, or model parameters. From these observations, we propose generalized MSDAs, a Hybrid version of Mixup and CutMix (HMix) and Gaussian Mixup (GMix), simple extensions of Mixup and CutMix. Our implementation can leverage the advantages of Mixup and CutMix, while our implementation is very efficient, and the computation cost is almost neglectable as Mixup and CutMix. Our empirical study shows that our HMix and GMix outperform the previous state-of-the-art MSDA methods in CIFAR-100 and ImageNet classification tasks.

cutmix, mixup and cutmix, msda, (11 more...)

arXiv.org Artificial Intelligence

2208.09913

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Filters

Collaborating Authors

mixup and cutmix

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective

e6f32e64b9c27d153b46c94f0fe22b56-Paper-Conference.pdf

DropMix: Better Graph Contrastive Learning with Harder Negative Samples

A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective